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Can you tell the difference? 
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How about now? 
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The Transformers 

When good input turns bad 

﹤script﹥

becomes 

<script> 
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Agenda 
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Unicode Transformations 

Agenda 

• Unicode crash course 

• Root Causes 

• Attack Vectors 

• Tools 

- Find Unicode issues in Web-testing 

- Visual Spoofing Detection 
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Unicode Transformations 

Agenda 

• Unicode crash course 
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Unicode Crash Course 

The Unicode Attack Surface 

• End users 

• Applications 

• Databases 

• Programming languages 

• Operating Systems 
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Unicode Crash Course 

Unthink it 
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Unicod 



A large and complex standard

• code points
• canonical
• encodings
• decomposition
• normalization
• binary properties
• 17 planes
• case mapping
• private use
• conversion tables
• script blocks
• bi-directional properties
• escaping
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Unicode Crash Course 

Code pages and charsets 



Shift_JIS

GB2312

ISCII 

Windows-1252 

ISO-8859-1 

EBCDIC 037 



WRONG 
WAY 
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Unicode Crash Course 

Ad Infinitum 

• Unicode can represent them all 

• ASCII range is preserved 

- U+0000 to U+007F are mapped to ASCII 
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Unicode Crash Course 

Code points 

• Unicode 5.1 uses a 21-bit scalar value with 
space for over 1,100,000 code points: 

U+0000 to U+10FFFF 
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Unicode Crash Course 

Code Points 



A = U+0041 



Every character has a unique number 
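
As a quick check of the numbers on these slides, here is a minimal Python sketch. It assumes only that Python's built-in `ord` works in Unicode code points; the helper name `u_plus` is made up for illustration.

```python
# The Unicode code space runs from U+0000 to U+10FFFF:
# 17 planes of 65,536 code points each.
PLANES = 17
PLANE_SIZE = 0x10000
MAX_CODE_POINT = 0x10FFFF

total = PLANES * PLANE_SIZE


def u_plus(ch: str) -> str:
    """Format one character in the U+XXXX notation used on the slides."""
    return f"U+{ord(ch):04X}"


print(total)                        # 1114112
print(MAX_CODE_POINT.bit_length())  # 21
print(u_plus("A"))                  # U+0041
```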
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Unicode Crash Course 



A

U+0041

Latin capital letter A
Category: Lu (Letter, Uppercase)
ToLower: U+0061
ToUpper: U+0041
Script: Basic Latin
Decomposition Type: none
Mapping: none

Binary Properties:
Hex Digit
Alphabetic
Uppercase
ID Start...



Black Hat USA - July 2009 



www.casabasecurity.com 



© 2009 Chris Weber 



Unicode Crash Course 



ſ

U+017F

Latin small letter long S
Category: Ll (Letter, Lowercase)
ToLower: U+017F
ToUpper: U+0053
Script: Latin Extended-A
Decomposition Type: <compat>
Mapping: U+0073

Binary Properties:
Alphabetic
Lowercase
ID Start...
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Unicode Crash Course 

Encodings 

UTF-8

- Variable width 1 to 4 bytes (used to be 6)

UTF-16

- Endianness

- Variable width 2 or 4 bytes

- Surrogate pairs!

UTF-32

- Endianness

- Fixed width 4 bytes

- Fixed mapping, no algorithms needed
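
The width and endianness points above can be observed directly with Python's built-in codecs. This is only an illustrative sketch; the helper name `widths` and the sample characters are chosen here, not taken from the talk.

```python
def widths(ch: str) -> dict:
    """Byte length of one character in each encoding form (no BOM)."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),
        "utf-32": len(ch.encode("utf-32-be")),
    }


# One sample per UTF-8 width: 1, 2, 3 and 4 bytes.
for ch in ("A", "\u00e9", "\uff21", "\U0001d160"):
    print(f"U+{ord(ch):04X}", widths(ch))

# Endianness: the same code point, two byte orders.
little = "A".encode("utf-16-le").hex()
big = "A".encode("utf-16-be").hex()
print(little, big)

# A code point above U+FFFF becomes a surrogate pair in UTF-16.
surrogates = "\U0001d160".encode("utf-16-be").hex()
print(surrogates)
```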
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Unicode Crash Course 

Encodings and Escape sequences 

U+FF21 FULLWIDTH LATIN CAPITAL LETTER A 

%EF%BC%A1 

&#xFF21; 

&#65313; 

\xEF\xBC\xA1

\uFF21 
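
The five spellings above all name the same code point. A small sketch that derives each form from U+FF21 using only the Python standard library (the dictionary keys are labels invented for this example):

```python
from urllib.parse import quote

ch = "\uff21"  # FULLWIDTH LATIN CAPITAL LETTER A
utf8 = ch.encode("utf-8")

forms = {
    "percent-encoded UTF-8": quote(ch),
    "HTML hex reference": f"&#x{ord(ch):X};",
    "HTML decimal reference": f"&#{ord(ch)};",
    "byte escapes": "".join(f"\\x{b:02X}" for b in utf8),
    "unicode escape": f"\\u{ord(ch):04X}",
}

for label, text in forms.items():
    print(f"{label}: {text}")
```

A filter that recognizes only one of these spellings will likely miss the others, which is why the decoding order matters in the sections that follow.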
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Unicode Transformations 

Agenda 

• Unicode crash course 

• Root Causes 

• Attack Vectors 

• Tools 
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Unicode Transformations 

Agenda 

• Root Causes 

• Attack Vectors 
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Unicode Transformations 

Overview 



Root Causes 

- Visual Spoofing and IDN's 

- Best-fit mappings 

- Normalization 

- Overlong UTF-8 

- Over-consumption 

- Character substitution 

- Character deletion 

- Casing 

- Buffer overflows 

- Controlling Syntax 

- Charset transformations 

- Charset mismatches 
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Root Causes 

Visual Spoofing 

• Over 100,000 assigned characters 

• Many lookalikes within and across scripts 

AAAAAAAa**A 
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Attack Vectors 

IDN homograph attacks 

Some browsers allow .COM IDN's 
based on script family 
- (Latin has a big family) 
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Attack Vectors 

IDN homograph attacks 
Safari

[Screenshot: Safari on an "IDN Visual Spoofing test" page, with the address bar displaying http://www.google.com/]
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Attack Vectors 

IDN homograph attacks 
Opera

[Screenshot: Opera with the address bar displaying http://www.google.com/ above a spoofed search page branded "Spoofle", with buttons "Spoofle Search" and "I'm Feelin Crappy" and the text "Hi Mom!"]
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Attack Vectors 

IDN homograph attacks 

www.google.com is not www.ɡoogle.com

Latin g
U+0067

Latin ɡ
U+0261

g ɡ
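
One way to make such a look-alike visible is to list every non-ASCII character in a host name by code point and name. The sketch below assumes a spoof that swaps U+0261 in for the first "g"; the function name `describe_non_ascii` is hypothetical.

```python
import unicodedata


def describe_non_ascii(hostname: str) -> list:
    """Return (code point, character name) for each non-ASCII character."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>"))
        for c in hostname
        if ord(c) > 0x7F
    ]


genuine = "www.google.com"
spoof = "www.\u0261oogle.com"  # assumed example: U+0261 replaces the first g

print(genuine == spoof)            # False, although they may render alike
print(describe_non_ascii(genuine))
print(describe_non_ascii(spoof))
```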
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Root Causes 

The state of International Domain Names 



ICANN guidelines v2.0

- Inclusion-based
  Deny-all default seems to be the right concept.

- Script limitations
  A script can cross many blocks. Even with limited script choices, there's plenty to choose from.

- Character limitations
  Great for domain labels, but subdomain labels are still open to punctuation and syntax spoofing.
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Attack Vectors 

Visual spoofing Vectors 

Non-Unicode attacks 

Confusables 

Invisibles 

Problematic font-rendering 

Manipulating Combining Marks 

Bidi and syntax spoofing 
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Attack Vectors 

Non-Unicode homograph attacks 
rn can look like m in certain fonts 



www.mullets.com is not www.rnullets.com 



Latin m
U+006D

Latin r + n
U+0072 U+006E
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Attack Vectors 

Non-Unicode homograph attacks 

Are you using mono-width fonts? 

0 and O
1 and l
5 and S






Attack Vectors 

Non-Unicode homograph attacks 
Classic long URL's 



http://login.facebook.intvitation.videomessageid-h048892r39.sessionnfbid.com/home.htm?/disbursements/






Attack Vectors 

Single-script and The Confusables 

www.ɑpple.com
// All Latin, using Latin small letter alpha 'ɑ'

www.faϲebook.com
// Mixed Latin/Greek with lunate sigma symbol 'ϲ'

www.аьс.com
// All Cyrillic 'аьс'
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Attack Vectors 

IDN homograph attacks 
Browsers whitelist .ORG 



[Screenshot: Firefox showing http://www.mozílla.org/ in the address bar. The page imitates mozilla.org ("Looking for Firefox or Thunderbird? You'll find them and a whole lot more at Mozilla.com.") and is labeled "This is a Spoof".]
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Attack Vectors 

IDN homograph attacks 
Others don't necessarily, but...

[Screenshot: another browser on the "IDN Visual Spoofing test" page, showing the spoofed mozilla.org URL in the address bar. The page is again labeled "This is a Spoof".]
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Attack Vectors 

IDN homograph attacks 

www.mozilla.org is not www.mozílla.org

Latin i
U+0069

Latin í
U+00ED
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Attack Vectors 

IDN Syntax Spoofing with / look-alikes

http://www.google.com／path／file.nottrusted.org

FULLWIDTH SOLIDUS
U+FF0F

(This case doesn't work anymore)
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Attack Vectors 

IDN Syntax Spoofing with / look-alikes

http://www.google.com/path/file.nottrusted.org

SOLIDUS
U+002F

(Normalized to a / U+002F)
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Attack Vectors 

IDN Syntax Spoofing with / look-alikes

http://www.google.comﾉpathﾉfile.nottrusted.org

Katakana No
U+FF89

(However punctuation not required...)
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Attack Vectors 

The Invisibles 



[Screenshot: Windows Explorer listing two entries of type File Folder, both stamped 12:29 PM, whose names look identical]

MyFolder 
My[U+FEFF]Folder 
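
The two folder names differ only by an invisible format character. A minimal sketch of how a tool might surface that, assuming it is enough to flag characters in general category Cf (the function name is illustrative):

```python
import unicodedata


def invisible_format_chars(name: str) -> list:
    """List code points in general category Cf (format characters)."""
    return [f"U+{ord(c):04X}" for c in name if unicodedata.category(c) == "Cf"]


plain = "MyFolder"
spoofed = "My\ufeffFolder"

print(plain == spoofed)                 # False
print(len(plain), len(spoofed))         # 8 9
print(invisible_format_chars(spoofed))  # ['U+FEFF']
```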
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Attack Vectors 

Visual Spoofing with Bidi Explicit Directional Overrides 




this-executes-[U+202E]txtexe 
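
A file name like the one above can be screened for explicit directional controls before it is displayed. The sketch below relies on the bidirectional class reported by Python's `unicodedata` module; the set and function names are assumptions made for this example.

```python
import unicodedata

# Bidi classes of the explicit embedding, override and isolate controls.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def bidi_controls(text: str) -> list:
    """Return (index, code point, bidi class) for each explicit bidi control."""
    return [
        (i, f"U+{ord(c):04X}", unicodedata.bidirectional(c))
        for i, c in enumerate(text)
        if unicodedata.bidirectional(c) in BIDI_CONTROL_CLASSES
    ]


filename = "this-executes-\u202etxtexe"
print(bidi_controls(filename))  # [(14, 'U+202E', 'RLO')]
```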
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Root Causes 

Best-fit mappings 

Commonly occur in charset transformations and
even innocuous APIs

Impact: Filter evasion, Enable code execution

When σ becomes s

U+03C3 GREEK SMALL LETTER SIGMA

When ′ becomes '

U+2032 PRIME
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Root Causes 

Guidance for Best-fit mappings

• Scrutinize character/charset manipulation APIs

• Use EncoderFallback with System.Text.Encoding

• Set the WC_NO_BEST_FIT_CHARS flag with
WideCharToMultiByte()

• Use Unicode end-to-end
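
The same "fail rather than best-fit" idea can be sketched in Python, whose built-in codecs raise an error instead of substituting a look-alike. This is an analogue of the guidance above, not the .NET or Win32 APIs named on the slide; the function name is hypothetical.

```python
def to_windows_1252_strict(text: str) -> bytes:
    """Encode to Windows-1252, raising instead of substituting a look-alike."""
    # errors="strict" is Python's default; spelled out here for emphasis.
    return text.encode("cp1252", errors="strict")


results = {}
for sample in ("\u03c3", "\u2032"):  # Greek sigma and PRIME from the previous slide
    try:
        to_windows_1252_strict(sample)
        results[sample] = "encoded"
    except UnicodeEncodeError:
        results[sample] = "rejected"

print(results)
```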






Case Study: Social Networking 

Best-fit mappings 

• A popular social networking site in 2008 

• Implemented complex filtering logic to 
prevent XSS 

- Attack: Filter evasion, code execution 

- Exploit: Bypass filtering logic with best-fit 
mappings to leverage cross-site scripting 

- Root Cause: best-fit mappings 
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Case Study: Social Networking 

Best-fit mappings 

-moz-binding()

was not allowed, but....

-[U+FF4D]oz-binding()
would best-fit map!
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Root Causes 

Normalization 

Normalizing strings after validation is dangerous 
Impact: Filter evasion, Enable code execution 
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Root Causes 

Normalization 



I becomes I + 
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Root Causes 

Normalization 

But are there dangerous characters? 

You bet... with NFKC and NFKD you could 
control HTML or other parsing 



﹤ becomes <
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Root Causes 

Normalization 



﹤ becomes <

U+FE64

U+003C

toNFKC("﹤script﹥") = "<script>"
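
To see which normalization forms perform this mapping, the following sketch runs all four forms over a string built from U+FE64 and U+FE65 (SMALL LESS-THAN SIGN and SMALL GREATER-THAN SIGN), using Python's `unicodedata.normalize`:

```python
import unicodedata

tricky = "\ufe64script\ufe65"

becomes_tag = {
    form: unicodedata.normalize(form, tricky) == "<script>"
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}

print(becomes_tag)
```

Only the compatibility forms change the string here; the canonical forms leave it as it was.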
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Root Causes 

Guidance for Normalization 

Normalize strings before validation 
NFKC first defense against Visual spoofing 
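
A minimal sketch of why the order matters. The validator here is a deliberately simple stand-in (it only rejects angle brackets) and every function name is hypothetical:

```python
import unicodedata


def is_safe(text: str) -> bool:
    """Hypothetical validator: reject anything containing < or >."""
    return "<" not in text and ">" not in text


def validate_then_normalize(text: str) -> str:
    """The dangerous order: the check runs on text that will change afterwards."""
    if not is_safe(text):
        raise ValueError("rejected")
    return unicodedata.normalize("NFKC", text)


def normalize_then_validate(text: str) -> str:
    """The recommended order: the check runs on the final form."""
    normalized = unicodedata.normalize("NFKC", text)
    if not is_safe(normalized):
        raise ValueError("rejected")
    return normalized


payload = "\ufe64script\ufe65"
print(validate_then_normalize(payload))  # <script> slips through
try:
    normalize_then_validate(payload)
except ValueError as exc:
    print("blocked:", exc)
```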
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Root Causes 

Non-shortest form UTF-8 

Non-shortest or overlong UTF-8 

Impact: Filter evasion, Enable code execution 

Application gets %C0%A7 

OS/Framework sees %27 

Database gets '
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Root Causes 

Guidance for Non-shortest form UTF-8 

• Unicode specification forbids 

- Generation of non-shortest form 

- Interpretation of non-shortest form for BMP 

• Validate UTF-8 encoding (throw on error) 
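
The following sketch contrasts a validating decoder with what a non-validating one would compute for the overlong pair C0 A7. Python's built-in UTF-8 codec is used as the strict decoder; `lax_two_byte_decode` is a hypothetical helper written only to show the arithmetic.

```python
def decode_utf8_strict(data: bytes) -> str:
    """Decode UTF-8, raising UnicodeDecodeError on any ill-formed sequence."""
    return data.decode("utf-8", errors="strict")


def lax_two_byte_decode(b1: int, b2: int) -> str:
    """What a decoder that skips the shortest-form check would produce."""
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))


overlong = bytes([0xC0, 0xA7])

print(repr(lax_two_byte_decode(0xC0, 0xA7)))  # the apostrophe, U+0027

try:
    decode_utf8_strict(overlong)
    outcome = "accepted"
except UnicodeDecodeError:
    outcome = "rejected"
print(outcome)
```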
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Attack Vectors 

Directory traversal 

How many ways can you say 



../ 
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Attack Vectors 



Normalization compatibility forms:
U+2024 U+2024 U+FF0F
%E2 %80 %A4 %E2 %80 %A4 %EF %BC %8F
․․／

Best-fit mapping Windows-1252:
U+FF0E U+FF0E U+2215
%EF %BC %8E %EF %BC %8E %E2 %88 %95
．．∕

UTF-8:
U+002E U+002E U+002F
%2E %2E %2F
../

UTF-8 overlong:
U+002E U+002E U+002F
%C0 %AE %C0 %AE %C0 %AF
../
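
A short sketch that checks two of these variants with the Python standard library: the percent-encoded bytes of the compatibility-form sequence, and the fact that NFKC folds it into a plain `../`. The function name `has_traversal` is made up, and the check covers only the normalization variant, not best-fit or overlong forms.

```python
import unicodedata
from urllib.parse import quote

compat_variant = "\u2024\u2024\uff0f"   # ONE DOT LEADER x2, FULLWIDTH SOLIDUS
best_fit_variant = "\uff0e\uff0e\u2215"  # FULLWIDTH FULL STOP x2, DIVISION SLASH


def has_traversal(path: str) -> bool:
    """Look for ../ only after compatibility normalization."""
    return "../" in unicodedata.normalize("NFKC", path)


print(quote(compat_variant))
print(quote(best_fit_variant))
print("../" in compat_variant, has_traversal(compat_variant))
```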
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Root Causes 

Handling the Unexpected 

• Unassigned code points 

- U+2073 

• Illegal code points 

- Half a surrogate pair

• Code points with special meaning
- U+FEFF is the BOM

• Impact: Filter evasion, Enable code execution 
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Root Causes 

Handling the Unexpected: Over-consumption 

Over-consuming ill-formed byte sequences 

* Big problem with MBCS lead bytes 

<41 C2 3E 41> becomes 

<41 41> 
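
For comparison, a decoder that does not over-consume replaces only the ill-formed lead byte and keeps the byte after it. The sketch below uses Python's UTF-8 codec with `errors="replace"` on the same four bytes; the function name is illustrative.

```python
data = bytes([0x41, 0xC2, 0x3E, 0x41])


def decode_replacing(raw: bytes) -> str:
    """Replace each ill-formed sequence with U+FFFD and resume at the next byte."""
    return raw.decode("utf-8", errors="replace")


decoded = decode_replacing(data)
print([f"U+{ord(c):04X}" for c in decoded])
# ['U+0041', 'U+FFFD', 'U+003E', 'U+0041']
```

The `>` (3E) survives, so any syntax it was meant to close stays closed.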
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Root Causes 

Handling the Unexpected: Over-consumption 



<img src="#[0xC2]"> "onerror="alert(1)"<br />

becomes

<img src="#>" onerror="alert(1)"<br />
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Root Causes 

Handling the Unexpected: Character-substitution 

Correcting insecurely rather than failing 
- Substituting a '.' or a '/' would be bad
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Root Causes 

Handling the Unexpected: Character-deletion 
"deletion of noncharacters" (UTR-36) 




delete 
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Root Causes 

Handling the Unexpected: Character-deletion 



<scr[U+FEFF]ipt> becomes <script>
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Root Causes 

Solutions for Handling the Unexpected 

• Fail or error 

• Use U+FFFD instead 

- A common alternative is '?', which can be safe 
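
A small sketch contrasting deletion with replacement by U+FFFD. The filter is a hypothetical substring check, used only to show how deletion after filtering reassembles the blocked token:

```python
BOM = "\ufeff"
REPLACEMENT = "\ufffd"


def naive_filter_allows(text: str) -> bool:
    """Hypothetical filter: allow anything that does not contain <script>."""
    return "<script>" not in text


def delete_unexpected(text: str) -> str:
    """Risky: silently removing the character can join two harmless halves."""
    return text.replace(BOM, "")


def replace_unexpected(text: str) -> str:
    """Safer: the replacement character keeps the halves apart."""
    return text.replace(BOM, REPLACEMENT)


payload = "<scr\ufeffipt>"
print(naive_filter_allows(payload))                     # True
print(delete_unexpected(payload))                       # <script>
print("<script>" in replace_unexpected(payload))        # False
```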






Attack Vectors 

Filter evasion 

• Bypass filters, WAF's, NIDS, and validation 

• Exploit delivery techniques 

- E.g. Cross-site scripting (buffer overflow of the 
Web) 
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Case Study: Apple and Mozilla 

Safari and Firefox BOM consumption 

- Attack: Filter evasion, code execution 

- Exploit: Bypass filtering logic with specially crafted 
strings to leverage cross-site scripting 

- Root Cause: Character deletion 

<a href="java[U+FEFF]script:alert('XSS')">
Can be nastier:

<a h[U+FEFF]ref="java[U+FEFF]script:al[U+FEFF]ert('XSS')">
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A Closer Look: The BOM 



BOM

U+FEFF

ZERO WIDTH NO-BREAK SPACE
(BYTE ORDER MARK)

Category: Cf [Other, Format]
Script: Common
Line Break: WJ [Word Joiner]

Binary Properties:
Default Ignorable Code Point

UTF-8: EF BB BF
UTF-16LE: FF FE
UTF-16BE: FE FF
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Root Causes 

Casing 

• Attackers manipulate casing operations to 
inject otherwise prohibited characters 

• Casing can multiply the buffer sizes needed 

• Impact: Filter evasion, Enable code execution 
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Root Causes 

Casing 



toLower("İ") == "i"

toLower("scrİpt") == "script"
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Root Causes 

Casing 



len(x) != len(toLower(x))
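
Python's default (locale-independent) case mappings show both effects: lengths that change, and non-ASCII input that turns into ASCII letters a filter may have been looking for. The sample characters are chosen for this sketch and are not all from the slides.

```python
dotted_capital_i = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
sharp_s = "\u00df"            # LATIN SMALL LETTER SHARP S
long_s_word = "\u017fcript"   # starts with LATIN SMALL LETTER LONG S
kelvin = "\u212a"             # KELVIN SIGN

lowered_i = dotted_capital_i.lower()
print(len(dotted_capital_i), len(lowered_i))  # 1 2

print(sharp_s.upper())        # SS
print(long_s_word.upper())    # SCRIPT
print(kelvin.lower())         # k
```

Note that the default mapping of U+0130 yields "i" followed by U+0307; a plain "i" is what locale-specific or simplified mappings produce.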
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Root Causes 

Guidance for Casing 

• Perform casing operations before validation 

• Leverage existing frameworks and API's 
- ICU, .Net 
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Root Causes 

Buffer Overflows 

• Incorrect assumptions about string sizes (chars 
vs. bytes) 

• Improper width calculations 

• Impact: Enable code execution 
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Root Causes 

Buffer Overflows 

Casing - maximum expansion factors 



Operation   UTF         Factor   Sample
Lower       8           1.5      U+023A  Ⱥ
Lower       16, 32      1        U+0041  A
Upper       8, 16, 32   3        U+0390  ΐ

Source: Unicode Technical Report #36
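
The casing factors can be reproduced by comparing encoded lengths before and after the operation. A minimal sketch using Python's built-in case mappings; the helper name `expansion` is hypothetical.

```python
def expansion(before: str, after: str, encoding: str) -> float:
    """Ratio of encoded sizes after and before an operation."""
    return len(after.encode(encoding)) / len(before.encode(encoding))


lower_sample = "\u023a"   # LATIN CAPITAL LETTER A WITH STROKE
upper_sample = "\u0390"   # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS

lower_utf8 = expansion(lower_sample, lower_sample.lower(), "utf-8")
upper_utf8 = expansion(upper_sample, upper_sample.upper(), "utf-8")
upper_utf16 = expansion(upper_sample, upper_sample.upper(), "utf-16-le")

print(lower_utf8, upper_utf8, upper_utf16)
```

A buffer sized for the input therefore cannot be assumed to hold the cased output.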
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Root Causes 

Buffer Overflows 

Normalization- maximum expansion factors 



Operation    UTF      Factor   Sample
NFC          8        3X       U+1D160  𝅘𝅥𝅮
NFC          16, 32   3X       U+FB2C  שּׁ
NFD          8        3X       U+0390  ΐ
NFD          16, 32   4X       U+1F82  ᾂ
NFKC/NFKD    8        11X      U+FDFA  ﷺ
NFKC/NFKD    16, 32   18X      U+FDFA  ﷺ

Source: Unicode Technical Report #36
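
The normalization factors can be measured the same way with `unicodedata.normalize`. This is a sketch for checking individual rows, with a hypothetical helper name; the exact results depend on the Unicode data bundled with the Python build, although these mappings have been stable for a long time.

```python
import unicodedata


def normalization_expansion(ch: str, form: str, encoding: str) -> float:
    """Encoded size of the normalized text divided by the original size."""
    normalized = unicodedata.normalize(form, ch)
    return len(normalized.encode(encoding)) / len(ch.encode(encoding))


nfd_utf8 = normalization_expansion("\u0390", "NFD", "utf-8")
nfd_utf16 = normalization_expansion("\u1f82", "NFD", "utf-16-le")
nfkc_utf8 = normalization_expansion("\ufdfa", "NFKC", "utf-8")
nfkc_utf16 = normalization_expansion("\ufdfa", "NFKC", "utf-16-le")

print(nfd_utf8, nfd_utf16, nfkc_utf8, nfkc_utf16)
```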
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Root Causes 

Guidance for Buffer Overflows 

• Know the difference between bytes and chars 

• Secure coding 

• Leverage existing frameworks and API's 
- ICU, .Net 
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Root Causes 

Controlling Syntax 

• White space and line breaks 

- E.g. when U+180E acts like U+0020 

• Quotation marks 

• Impact: Filter evasion, Enable code execution 
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Attacks and Exploits 

Controlling syntax 

• Manipulate HTML parsers and JavaScript
interpreters

• Control protocols 






Case Study: Opera 

• Unicode formatter characters exploited for 
XSS 

- Damage: Filter evasion, controlling syntax 

- Exploit: Bypass filtering logic with specially crafted 
characters to leverage cross-site scripting. 

- Root Cause: Interpreting "white space" 

- A problem with HTML 4.0 spec? 
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Case Study: Opera 



<a href=#[U+180E]onclick=alert()>
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Case Study: Opera 



MVS

U+180E

MONGOLIAN VOWEL SEPARATOR

Category: Zs [Separator, Space]
Script: Mongolian
Line Break: GL [Non-breaking ("Glue")]

Binary Properties:
White Space
Grapheme Base
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Root Causes 

Guidance for Controlling Syntax 

• Question specifications 

• Be careful... 
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Root Causes 

Specifications 

1) Character stability 

- IDNA/Nameprep based on Unicode 3.2 

2) Designs 

- Specs are carefully designed but not always perfect 

• This could have been a problem: 

- "When designing a markup language or data protocol, the use of 
U+FEFF can be restricted to that of Byte Order Mark. In that case, 
any U+FEFF occurring in the middle of the file can be ignored, or 
treated as an error. " 

- HTML 4.01 

• Defines four whitespace characters and explicitly leaves 
handling other characters up to implementer. 
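
One way to avoid leaving that decision implicit is to write the white space set down explicitly. The sketch below uses the four characters HTML 4.01 lists as white space (space, tab, form feed, zero-width space); the constant and function names are assumptions made for illustration.

```python
# The four white space characters listed by HTML 4.01.
HTML401_WHITE_SPACE = frozenset({"\u0020", "\u0009", "\u000c", "\u200b"})


def is_html401_white_space(ch: str) -> bool:
    """True only for the characters the specification names explicitly."""
    return ch in HTML401_WHITE_SPACE


def first_unlisted_separator(text: str):
    """Return the first non-ASCII character outside the explicit set, if any."""
    for ch in text:
        if ord(ch) > 0x7F and not is_html401_white_space(ch):
            return f"U+{ord(ch):04X}"
    return None


markup = "<a href=#\u180eonclick=alert()>"
print(is_html401_white_space("\u180e"))   # False
print(first_unlisted_separator(markup))   # U+180E
```

A filter and a parser that disagree about membership in this set are what make the attack in the case study possible.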
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Root Causes 

Charset Transformations 

• Converting between charsets is dangerous 

• Mapping tables and algorithms vary across 
platforms 

• Impact: Filter evasion, Enable code execution, 
Data-loss 
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Root Causes 

Guidance for Charset Transformations 

• Avoid if possible 

• Use Unicode as the broker 

• Beware the PUA mappings 

• Transform, case, and normalize prior to 
validation and redisplay 
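
The last bullet describes an ordering, which could look roughly like the following. It is a sketch under stated assumptions: the input arrives as bytes with a known charset, NFKC is the chosen normalization form, Python's `casefold` stands in for the casing step, and the validator is a placeholder supplied by the caller.

```python
import unicodedata
from typing import Callable


def canonicalize(raw: bytes, source_charset: str) -> str:
    """Transform, normalize, then case: every step that can change the text."""
    text = raw.decode(source_charset, errors="strict")  # transform, fail on bad input
    text = unicodedata.normalize("NFKC", text)          # normalize
    return text.casefold()                              # case


def accept(raw: bytes, source_charset: str, is_allowed: Callable[[str], bool]) -> str:
    """Validate only the final form, and return that same form for later use."""
    text = canonicalize(raw, source_charset)
    if not is_allowed(text):
        raise ValueError("input rejected after canonicalization")
    return text


def no_script_tag(text: str) -> bool:
    """Placeholder validator for this example."""
    return "<script>" not in text


raw = "\ufe64SCRIPT\ufe65".encode("utf-8")
print(canonicalize(raw, "utf-8"))  # <script>
```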






Root Causes 

Charset Mismatches 

• Some charset identifiers are ill-defined 

• Vendor implementations vary 

• User-agents may sniff if confused 

• Attackers manipulate behavior 

• Impact: Filter evasion, Enable code execution 
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Root Causes 

Charset Mismatches 



Content-Type: charset=ISO-8859-1



Attacker-controlled input 



<meta http-equiv="Content-Type" content="text/html; charset=shift_jis"/>
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Root Causes 

Guidance for Charset Mismatches 

• Force UTF-8 

• Error if uncertain 
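
A small sketch of "force UTF-8, error if uncertain" applied to a Content-Type header value. It uses the standard library's `email.message.Message` to parse the `charset` parameter; the function names and the policy of rejecting a missing charset are assumptions for this example.

```python
from email.message import Message


def declared_charset(content_type: str):
    """Return the lower-cased charset parameter, or None if there is none."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def require_utf8(content_type: str) -> None:
    """Raise unless the header explicitly declares UTF-8."""
    charset = declared_charset(content_type)
    if charset != "utf-8":
        raise ValueError(f"unexpected or missing charset: {charset!r}")


require_utf8("text/html; charset=UTF-8")  # passes silently

for header in ("text/html; charset=ISO-8859-1", "text/html"):
    try:
        require_utf8(header)
    except ValueError as exc:
        print(header, "->", exc)
```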
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Unicode Transformations 

Agenda 

• Unicode crash course 

• Root Causes 

• Attack Vectors 

• Tools 
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Unicode Transformations 

Agenda 
• Tools 
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Tools 

• Watcher 

- Passive Web-app security testing and auditing 

• Unibomber 

- XSS autopwn testing tool 
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Tools 

Watcher - Some of the Passive Checks Included

Unicode transformation hot-spots 

User-controlled HTML 

Cross-domain issues 

Insecure cookies 

Insecure HTTP/HTTPS transitions 

SSL protocol and certificate issues 

XSS hot-spots 

Flash issues 

Silverlight issues 

Information disclosure 
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Tools 



[Screenshot: Watcher by Casaba Security. A results list with the columns Severity, Session ID, Type and URL, showing alerts for requests to www.nottrusted.com]

Alert types shown, by severity:

• High: Invalid Unicode ByteStream; Null Bytes in ByteStream; User Controllable Location Header; User Controllable Charset

• Medium: Set-Cookie HTTPOnly Attribute Not Set; Set-Cookie Secure Attribute Not Set; Set-Cookie Loosely Scoped Domain; Flash allowScriptAccess Value; Flash crossdomain.xml Insecure Domain Reference; Silverlight clientaccesspolicy.xml Insecure Domain Reference

• Informational: Charset not UTF-8

Selected alert (Risk: High): an invalid UTF-8 ByteStream was detected for a request to www.nottrusted.com. An invalid 2 character UTF-8 ByteStream was found; invalid byte(s): C3 75
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Tools 

Watcher - Web-app Security Testing and Auditing 



http://websecuritytool.codeplex.com 
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Tools 

Unibomber - runtime XSS testing tool

• Deterministic testing 

• Auto-inject payloads 

• Unicode transformers 
-<>'", etc. 

• Detect transformations and encoding 
hotspots 
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Thank you! 




Casaba Security 
www.casabasecurity.com 

Chris Weber 

Blog: www.lookout.net 

Email: chris@casabasecurity.com 

LinkedIn: http://www.linkedin.com/in/chrisweber



